TITLE OF THE INVENTION 
Distributed document processing. 

FIELD OF THE INVENTION 
The present invention relates to document processing systems in general, and more 
particularly to computer-based document processing systems that employ document imaging and 
optical character recognition. 

BACKGROUND OF THE INVENTION 
Document processing employing Optical Character Recognition (OCR) is well known. 
The OrboCAR line of products, commercially available from Orbograph Ltd., of Yavne, Israel, 
provides optical character recognition of handwritten and printed document elements, primarily for 
the banking industry. While the OrboCAR line of products automatically reads the majority of 
those document elements successfully, the remaining document elements must be manually keyed 
by clerks. Such manual keying involves high labor costs and management attention. 

SUMMARY OF THE INVENTION 

The present invention seeks to provide a distributed document processing architecture 
that overcomes disadvantages and limitations of the prior art. 

In one aspect of the present invention a method for document processing is provided, 
the method including a) receiving availability profiles from a plurality of personnel operating a 
plurality of remote computers, b) receiving a work order from a remote customer's computer, the 
work order having a time frame within which the work order may be serviced, where any of the 
availability profiles indicates that any of the personnel are available to service the work order 
within the time frame, c) receiving within the context of the work order an image of a document 
from the remote customer's computer, d) decomposing the image into a plurality of data entry 
region sub-images, e) providing any of the plurality of data entry region sub-images to the 
available personnel at the remote computers, and f) receiving from each of the plurality of remote 
computers a data entry value associated with at least one of the data entry region sub-images. 

In another aspect of the present invention any of the steps are performed at a central 
computer that is in communication with any of the remote computers. 

In another aspect of the present invention the providing step e) includes transmitting 
each of the data entry region sub-images together with a unique identifier. 

In another aspect of the present invention the method further includes collating the 
data entry values into a character-based electronic document corresponding to the image. 

In another aspect of the present invention the method further includes providing the 
electronic document to the remote customer. 

In another aspect of the present invention the method further includes performing 
optical character recognition on any of the data entry region sub-images, and the providing step e) 
includes providing a data entry region sub-image if a score related to a result of the optical 
character recognition performed on that sub-image is below a predefined threshold. 

In another aspect of the present invention the method further includes performing 
optical character recognition on any of the data entry region sub-images, thereby resulting in an 
optical character recognition value, comparing the data entry value associated with the data entry 
region sub-image to the optical character recognition value, and where the data entry value and 
the optical character recognition value differ, providing the data entry region sub-image to another 
one of the available personnel to which the data entry region sub-image was not previously 
provided. 

In another aspect of the present invention the providing step e) includes providing at 
least one of the plurality of data entry region sub-images to at least two of the available personnel, 
and where a predetermined number of the data entry values associated with the data entry region 
sub-images are the same, the collating step includes collating one of the predetermined number of 
the data entry values. 

In another aspect of the present invention the method further includes providing in the 
providing step e) at least one of the plurality of data entry region sub-images to a plurality of the 
available personnel, performing optical character recognition on the data entry region sub-image, 
thereby resulting in an optical character recognition value, comparing a plurality of the data entry 
values associated with the data entry region sub-image and the optical character recognition value, 
and collating in the collating step one of the values from among a predetermined number of the 
values that are the same. 

In another aspect of the present invention the method further includes receiving from 
any of the plurality of remote computers an indicator associated with at least one of the data entry 
region sub-images rejecting the associated data entry region sub-image, and providing to the 
rejecting remote computer an expanded data entry region sub-image that includes the rejected data 
entry region sub-image. 

In another aspect of the present invention the method further includes rating the 
performance of any of the data entry clerks, and selecting any of the data entry clerks to service 
the work order whose performance rating equals or exceeds a performance rating specified for the 
work order. 

In another aspect of the present invention the method further includes selecting any of 
the data entry clerks to service the work order who have been pre-approved by the customer. 

In another aspect of the present invention a system for document processing is 
provided, the system including a plurality of availability profiles for a plurality of personnel 
operating a plurality of remote computers, a work order received from a remote customer's 
computer, the work order having a time frame within which the work order may be serviced, 
means for determining whether any of the availability profiles indicates that any of the personnel 
are available to service the work order within the time frame, means for receiving within the 
context of the work order an image of a document from the remote customer's computer, means 
for decomposing the image into a plurality of data entry region sub-images, means for providing 
any of the plurality of data entry region sub-images to the available personnel at the remote 
computers, and means for receiving from each of the plurality of remote computers a data entry 
value associated with at least one of the data entry region sub-images. 

In another aspect of the present invention the system further includes a central 
computer that is in communication with any of the remote computers and that is configured with 
any of the elements mentioned hereinabove. 

In another aspect of the present invention the means for providing is operative to 
transmit each of the data entry region sub-images together with a unique identifier. 

In another aspect of the present invention the system further includes means for 
collating the data entry values into a character-based electronic document corresponding to the 
image. 

In another aspect of the present invention the system further includes means for 
providing the electronic document to the remote customer. 

In another aspect of the present invention the system further includes means for 
performing optical character recognition on any of the data entry region sub-images, and the 
means for providing is operative to provide a data entry region sub-image if a score related to a 
result of the optical character recognition performed on that sub-image is below a predefined 
threshold. 

In another aspect of the present invention the system further includes means for 
performing optical character recognition on any of the data entry region sub-images and being 
operative to provide an optical character recognition value, means for comparing the data entry 
value associated with the data entry region sub-image to the optical character recognition value, 
and means for providing the data entry region sub-image to another one of the available personnel 
to which the data entry region sub-image was not previously provided, operative where the data 
entry value and the optical character recognition value differ. 

In another aspect of the present invention the means for providing is operative to 
provide at least one of the plurality of data entry region sub-images to at least two of the available 
personnel, and, where a predetermined number of the data entry values associated with the data 
entry region sub-images are the same, the means for collating is operative to collate one of 
the predetermined number of the data entry values. 

In another aspect of the present invention the means for providing is operative to 
provide at least one of the plurality of data entry region sub-images to a plurality of the available 
personnel, and the system further includes means for performing optical character recognition on 
the data entry region sub-image, operative to provide an optical character recognition value, and 
means for comparing a plurality of the data entry values associated with the data entry region 
sub-image and the optical character recognition value, and the means for collating is operative 
to collate one of the values from among a predetermined number of the values that are the same. 

In another aspect of the present invention the system further includes means for 
receiving from any of the plurality of remote computers an indicator associated with at least one of 
the data entry region sub-images rejecting the associated data entry region sub-image, and means 
for providing to the rejecting remote computer an expanded data entry region sub-image that 
includes the rejected data entry region sub-image. 

In another aspect of the present invention the system further includes a performance 
rating of any of the data entry clerks, and means for selecting any of the data entry clerks to 
service the work order whose performance rating equals or exceeds a performance rating specified 
for the work order. 

In another aspect of the present invention the system further includes means for 
selecting any of the data entry clerks to service the work order who have been pre-approved by 
the customer. 

The disclosures of all patents, patent applications, and other publications mentioned in 
this specification and of the patents, patent applications, and other publications cited therein are 
hereby incorporated by reference in their entirety. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will be understood and appreciated more fully from the 
following detailed description taken in conjunction with the appended drawings in which: 

Fig. 1 is a simplified conceptual illustration of a distributed document processing 
architecture, constructed and operative in accordance with a preferred embodiment of the present 
invention; 

Figs. 2A - 2B, taken together, is a simplified flowchart illustration of an exemplary 
mode of operation of the architecture of Fig. 1, operative in accordance with a preferred 
embodiment of the present invention; 

Figs. 3A - 3B, taken together, is a simplified flowchart illustration of an exemplary 
mode of operation of the architecture of Fig. 1, operative in accordance with a preferred 
embodiment of the present invention; 

Figs. 4A - 4B, taken together, is a simplified flowchart illustration of an exemplary 
mode of operation of the architecture of Fig. 1, operative in accordance with a preferred 
embodiment of the present invention; 

Figs. 5A - 5B, taken together, is a simplified flowchart illustration of an exemplary 
mode of operation of the architecture of Fig. 1, operative in accordance with a preferred 
embodiment of the present invention; 

Fig. 6 is a simplified flowchart illustration of an exemplary mode of operation of the 
architecture of Fig. 1, operative in accordance with a preferred embodiment of the present 
invention; 

Fig. 7 is a simplified flowchart illustration of an exemplary mode of operation of the 
architecture of Fig. 1, operative in accordance with a preferred embodiment of the present 
invention; and 

Fig. 8 is a simplified flowchart illustration of an exemplary mode of operation of the 
architecture of Fig. 1, operative in accordance with a preferred embodiment of the present 
invention. 

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 
Reference is now made to Fig. 1, which is a simplified conceptual illustration of a 
distributed document processing architecture, constructed and operative in accordance with a 
preferred embodiment of the present invention. In the architecture of Fig. 1 one or more 
computers 100, such as for use by data entry clerks, are configured for communication with one or 
more computers 102 via a communications medium 104, such as the Internet. Similarly, one or 
more customer computers 106 are configured for communication with computer 102 via 
communications medium 104 or any other suitable communications medium. Any of computers 
106 and/or any of computers 102 may be configured to perform optical character recognition on 
images of documents that include portions that require optical character recognition, such as, but 
not limited to, handwritten portions, and may be otherwise configured to perform portions of any 
of the methods described hereinbelow. 

Reference is now made to Figs. 2A and 2B, which, taken together, is a simplified 
flowchart illustration of an exemplary mode of operation of the architecture of Fig. 1, operative in 
accordance with a preferred embodiment of the present invention. In the method of Figs. 2A and 
2B one or more data entry clerks at one or more of computers 100 provide an availability profile 
to computer 102 via communications medium 104 or any other suitable communications medium. 
The availability profile of a data entry clerk preferably indicates the availability of the clerk to 
perform manual data entry tasks at various times, such as by specifying specific dates, days of the 
week, hours of the day, etc. Independently, any of customer computers 106 send one or more 
work orders to computer 102 via communications medium 104 or any other suitable 
communications medium. Each work order preferably indicates a time frame within which a job 
may be serviced, as well as some measure of the magnitude of the job, such as the number of 
documents to be processed. Computer 102 then identifies those data entry clerks whose 
availability profiles indicate that they would be available to work on the job within the 
indicated time frame. 
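
By way of non-limiting illustration only, the identification of available data entry clerks described above may be sketched as follows. The sketch assumes that each availability profile has been reduced to a list of start and end times and that a clerk is considered available if any such window overlaps the time frame of the work order. The names AvailabilityProfile, WorkOrder, and available_clerks, and the sample dates, are hypothetical and are not taken from the figures.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class AvailabilityProfile:
    clerk_id: str
    # Each window is a (start, end) pair during which the clerk can key data.
    windows: List[Tuple[datetime, datetime]]


@dataclass
class WorkOrder:
    order_id: str
    start: datetime       # beginning of the time frame for servicing the job
    end: datetime         # end of the time frame
    document_count: int   # a measure of the magnitude of the job


def available_clerks(profiles, order):
    """Return IDs of clerks with at least one window overlapping the time frame."""
    matched = []
    for profile in profiles:
        # Two intervals overlap when each one starts before the other ends.
        if any(s < order.end and e > order.start for s, e in profile.windows):
            matched.append(profile.clerk_id)
    return matched


# Hypothetical sample data.
profiles = [
    AvailabilityProfile("clerk-a", [(datetime(2024, 1, 8, 9), datetime(2024, 1, 8, 13))]),
    AvailabilityProfile("clerk-b", [(datetime(2024, 1, 9, 9), datetime(2024, 1, 9, 17))]),
]
order = WorkOrder("wo-1", datetime(2024, 1, 8, 12), datetime(2024, 1, 8, 18), 200)
print(available_clerks(profiles, order))  # ['clerk-a']
```

A fuller implementation might also weigh the magnitude of the job against the total overlapping time, which this sketch does not attempt.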

Together with the work order, or separately therefrom, computer 106 sends to 
computer 102 for processing one or more document images that comprise the job specified by the 
work order. Computer 102 then decomposes each image into one or more data entry region sub- 
images using conventional techniques, where each sub-image includes an element that requires 
interpretation or recognition, now referred to as a recognition element, such as, but not limited to, 
a handwritten element. Computer 102 then provides each sub-image to one or more available data 
entry clerks at one or more computers 100, typically together with a unique identifier identifying 
the sub-image. The data entry clerk then views the sub-image, keys in a data entry value from the 
characters appearing in the recognition element of the sub-image, and transmits the data entry 
value to computer 102, typically together with the unique identifier where provided. For each 
document image, computer 102 collates data entry values received from data entry clerks into a 
character-based electronic document corresponding to the document image. Optionally (optional 
steps are shown in dashed lines), where a single sub-image is provided to more than one data entry 
clerk who each provide an associated data entry value, if a predetermined number of these values 
are the same, the matching result may be selected for collation as indicated above. 
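
By way of non-limiting illustration only, the collation of data entry values by unique identifier, together with the optional selection of a matching result, may be sketched as follows. The sketch assumes that data entry values arrive as pairs of identifier and value and that the predetermined number is two. The function names consensus and collate and the sample identifiers are hypothetical.

```python
from collections import Counter, defaultdict


def consensus(values, required):
    """Return the value keyed identically by at least `required` clerks, else None."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count >= required else None


def collate(sub_image_ids, submissions, required=2):
    """Assemble agreed values, in document order, into a character-based record.

    sub_image_ids: unique identifiers in the order the regions appear.
    submissions:   (sub_image_id, data_entry_value) pairs received from clerks.
    Returns the collated values and the identifiers still lacking agreement.
    """
    by_id = defaultdict(list)
    for sub_image_id, value in submissions:
        by_id[sub_image_id].append(value)
    document, unresolved = {}, []
    for sub_image_id in sub_image_ids:
        agreed = consensus(by_id[sub_image_id], required)
        if agreed is None:
            unresolved.append(sub_image_id)
        else:
            document[sub_image_id] = agreed
    return document, unresolved


# Hypothetical sample data: two clerks key each of two sub-images.
ids = ["chk-001/amount", "chk-001/date"]
received = [
    ("chk-001/amount", "125.00"),
    ("chk-001/date", "03/04"),
    ("chk-001/amount", "125.00"),
    ("chk-001/date", "08/04"),
]
document, unresolved = collate(ids, received)
print(document)    # {'chk-001/amount': '125.00'}
print(unresolved)  # ['chk-001/date']
```

Sub-images reported as unresolved could, for example, be provided to a further data entry clerk.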

Reference is now made to Figs. 3A and 3B, which, taken together, is a simplified 
flowchart illustration of an exemplary mode of operation of the architecture of Fig. 1, operative in 
accordance with a preferred embodiment of the present invention. The method of Figs. 3A and 
3B is similar to the method of Figs. 2A and 2B except as is now noted. Unlike the method of 
Figs. 2A and 2B, computer 106 decomposes each document image into one or more data entry 
region sub-images using conventional techniques, where each sub-image includes a recognition 
element. Computer 106 then provides each sub-image to computer 102, typically together with a 
unique identifier identifying the sub-image. Computer 102 then provides each sub-image to one or 
more available data entry clerks at one or more computers 100, typically together with the unique 
identifier identifying the sub-image. The data entry clerk then views the sub-image, keys in a data 
entry value from the characters appearing in the recognition element of the sub-image, and 
transmits the data entry value to computer 102, typically together with the unique identifier where 
provided. For each document image, computer 102 collates data entry values received from data 
entry clerks into a character-based electronic document corresponding to the document image. 
Alternatively, computer 102 may forward the data entry values to computer 106 which then 
collates the data entry values into a character-based electronic document corresponding to the 
document image. In either case, where a single sub-image is provided to more than one data 
entry clerk who each provide an associated data entry value, if a predetermined number of these 
values are the same, the matching result may be selected for collation as indicated above. 

Reference is now made to Figs. 4A and 4B, which, taken together, is a simplified 
flowchart illustration of an exemplary mode of operation of the architecture of Fig. 1, operative in 
accordance with a preferred embodiment of the present invention. The method of Figs. 4A and 
4B is similar to the method of Figs. 2A and 2B except as is now noted. As in the method of Figs. 
2A and 2B computer 102 or 106 decomposes each image into one or more data entry region sub- 
images using conventional techniques, where each sub-image includes a recognition element. 
However, each sub-image is then subjected to conventional OCR processing. Where the result of 
OCR processing for a sub-image indicates that the sub-image was not successfully processed, or 
where a confidence rating related to the result of the OCR processing is below a 
predefined threshold, computer 102 provides the sub-image to one or more available data entry 
clerks at one or more computers 100. The data entry clerk then views the sub-image, keys in a 
data entry value from the characters appearing in the recognition element of the sub-image, and 
transmits the data entry value to computer 102. For each document image, computer 102 collates 
successful OCR results and data entry values received from data entry clerks into a character- 
based electronic document corresponding to the document image. 
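
By way of non-limiting illustration only, the routing of sub-images according to the OCR result may be sketched as follows. The sketch assumes that the OCR engine reports a confidence rating between 0.0 and 1.0 and that a missing value denotes unsuccessful processing. The names OcrResult and route and the threshold of 0.9 are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OcrResult:
    sub_image_id: str
    value: Optional[str]   # None when the engine produced no reading
    confidence: float      # assumed to lie between 0.0 and 1.0


def route(results, threshold=0.9):
    """Split OCR results into accepted values and sub-images needing manual keying."""
    accepted, to_clerks = {}, []
    for r in results:
        if r.value is None or r.confidence < threshold:
            to_clerks.append(r.sub_image_id)
        else:
            accepted[r.sub_image_id] = r.value
    return accepted, to_clerks


# Hypothetical sample data.
results = [
    OcrResult("f1", "125.00", 0.98),
    OcrResult("f2", "O3/O4", 0.41),
    OcrResult("f3", None, 0.0),
]
accepted, to_clerks = route(results)
print(accepted)   # {'f1': '125.00'}
print(to_clerks)  # ['f2', 'f3']
```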

Reference is now made to Figs. 5A and 5B, which, taken together, is a simplified 
flowchart illustration of an exemplary mode of operation of the architecture of Fig. 1, operative in 
accordance with a preferred embodiment of the present invention. The method of Figs. 5A and 
5B is similar to the method of Figs. 2A and 2B except as is now noted. As in the method of Figs. 
2A and 2B computer 102 or 106 decomposes each image into one or more data entry region sub- 
images using conventional techniques, where each sub-image includes a recognition element. 
However, each sub-image is then subject to conventional OCR processing. Computer 102 then 
provides each sub-image to one or more available data entry clerks at one or more computers 100. 
The data entry clerk then views the sub-image, keys in a data entry value from the characters 
appearing in the recognition element of the sub-image, and transmits the data entry value to 
computer 102. Computer 102 then compares the OCR value for each sub-image with the data 
entry value for the same sub-image. Computer 102 then collates those data entry values and OCR 
values that have been "validated," i.e., where the values are the same, into a character-based 
electronic document corresponding to the document image. Where the data entry value and the 
OCR value for a sub-image differ, computer 102 provides the sub-image to at least one other 
available data entry clerk at computer 100 to whom the sub-image was not previously provided. 
As before, the data entry clerk then views the sub-image, keys in a data entry value from the 
characters appearing in the recognition element of the sub-image, and transmits the data entry 
value to computer 102. Computer 102 then compares the OCR value for the sub-image with each 
of the data entry values for the same sub-image. If a predetermined number of the values are the 
same, the matching result is considered to be the verified result, which computer 102 then collates 
as indicated above. Alternatively, computer 102 compares the data entry values for 
the same sub-image with one another, disregarding the OCR value, and considers a value to be 
verified only if a predetermined number of data entry values are the same. 
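
By way of non-limiting illustration only, the validation of data entry values against the OCR value, and the selection of a further data entry clerk where the values differ, may be sketched as follows. The sketch assumes that the predetermined number is two and treats the OCR value as one more candidate unless it is excluded, which corresponds to the alternative in which only data entry values are compared. The function names verify and next_clerk are hypothetical.

```python
from collections import Counter


def verify(ocr_value, data_entry_values, required=2, include_ocr=True):
    """Return the verified value for one sub-image, or None if another clerk is needed."""
    candidates = list(data_entry_values)
    if include_ocr and ocr_value is not None:
        candidates.append(ocr_value)
    if not candidates:
        return None
    value, count = Counter(candidates).most_common(1)[0]
    return value if count >= required else None


def next_clerk(available, already_asked):
    """Pick an available clerk to whom the sub-image was not previously provided."""
    for clerk_id in available:
        if clerk_id not in already_asked:
            return clerk_id
    return None


# Hypothetical sample run: the OCR value and the first clerk disagree.
first = verify("1250", ["1280"])
print(first)                                               # None
print(next_clerk(["clerk-a", "clerk-b"], {"clerk-a"}))     # clerk-b
print(verify("1250", ["1280", "1280"]))                    # 1280
print(verify("1250", ["1250"], include_ocr=False))         # None
```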

Reference is now made to Fig. 6, which is a simplified flowchart illustration of an 
exemplary mode of operation of the architecture of Fig. 1, operative in accordance with a 
preferred embodiment of the present invention. The method of Fig. 6 may be used in conjunction 
with any of the methods described herein. In the method of Fig. 6, computer 102 provides a sub- 
image to one or more available data entry clerks at one or more computers 100. The data entry 
clerk then views the sub-image, and, if the sub-image is unclear, the data entry clerk may send to 
computer 102 an indicator associated with the sub-image rejecting the sub-image, whereupon 
computer 102 may send to the data entry clerk an expanded data entry region sub-image that 
includes the rejected data entry region sub-image. The expanded sub-image may include more 
area of the document image and/or may be magnified using conventional techniques. 
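
By way of non-limiting illustration only, the computation of an expanded data entry region may be sketched as follows. The sketch assumes that a sub-image is described by a rectangular box in pixel coordinates of the document image and that the region is grown by a fixed fraction of its size on each side. The function name expand_region and the margin of 0.5 are hypothetical, and magnification is not shown.

```python
def expand_region(box, image_size, margin=0.5):
    """Grow a (left, top, right, bottom) box by `margin` of its size on each side,
    clamped to the bounds of the document image given as (width, height)."""
    left, top, right, bottom = box
    width, height = image_size
    dx = int((right - left) * margin)
    dy = int((bottom - top) * margin)
    return (
        max(0, left - dx),
        max(0, top - dy),
        min(width, right + dx),
        min(height, bottom + dy),
    )


# Hypothetical sample: a 200 x 40 region inside a 320 x 200 document image.
print(expand_region((100, 40, 300, 80), (320, 200)))  # (0, 20, 320, 100)
```

Because the expanded box always contains the original box, the expanded sub-image includes the rejected sub-image.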

Reference is now made to Fig. 7, which is a simplified flowchart illustration of an 
exemplary mode of operation of the architecture of Fig. 1, operative in accordance with a 
preferred embodiment of the present invention. The method of Fig. 7 may be used in conjunction 
with any of the methods described herein. In the method of Fig. 7 a data entry clerk is given a 
performance rating using any known rating technique. The rating may be based on past 
performance and/or on performance in keying a predefined set of training images whose values 
are known. When selecting a data entry clerk for work on a particular job, only those data entry 
clerks whose performance rating equals or exceeds a performance rating specified by the system 
administrator or by the customer in a work order may be selected to work on the job. Similarly, as 
is shown with particular reference to Fig. 8, when selecting a data entry clerk for work on a 
particular job, only those data entry clerks who have been pre-approved by the system 
administrator or by the customer, such as by specifically identifying the clerk or by pre-approving 
clerks according to specific attributes such as, but not limited to, qualification level, geographic 
location, or organizational association, may be selected to work on the job. 
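
By way of non-limiting illustration only, the rating of data entry clerks against training images and the selection of clerks by performance rating and pre-approval may be sketched as follows. The sketch assumes that the rating is the fraction of training images keyed correctly, which is only one of many possible rating techniques, and that pre-approval is expressed either as a set of clerk identifiers or as a set of approved locations. The names Clerk, rate, and eligible and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clerk:
    clerk_id: str
    rating: float          # hypothetical score between 0.0 and 1.0
    location: str = ""     # one example of an attribute usable for pre-approval


def rate(keyed, known):
    """Fraction of training images whose keyed value matches the known value.

    `known` maps training image IDs to their true values and must be non-empty.
    """
    correct = sum(1 for image_id, truth in known.items() if keyed.get(image_id) == truth)
    return correct / len(known)


def eligible(clerks, min_rating=0.0, approved_ids=None, approved_locations=None):
    """Filter clerks by minimum rating and by optional pre-approval criteria."""
    selected = []
    for clerk in clerks:
        if clerk.rating < min_rating:
            continue
        if approved_ids is not None and clerk.clerk_id not in approved_ids:
            continue
        if approved_locations is not None and clerk.location not in approved_locations:
            continue
        selected.append(clerk.clerk_id)
    return selected


# Hypothetical sample data.
known = {"t1": "125.00", "t2": "03/04", "t3": "Smith", "t4": "9041"}
clerks = [
    Clerk("clerk-a", rate({"t1": "125.00", "t2": "03/04", "t3": "Smith", "t4": "9041"}, known), "site-1"),
    Clerk("clerk-b", rate({"t1": "125.00", "t2": "08/04", "t3": "Smyth", "t4": "9041"}, known), "site-1"),
    Clerk("clerk-c", 0.95, "site-2"),
]
print(eligible(clerks, min_rating=0.9))                                   # ['clerk-a', 'clerk-c']
print(eligible(clerks, min_rating=0.9, approved_locations={"site-1"}))    # ['clerk-a']
```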

It is appreciated that one or more of the steps of any of the methods described herein 
may be omitted or carried out in a different order than that shown, without departing from the true 
spirit and scope of the invention. 

While the present invention has been described with reference to one or more specific 
embodiments, the description is intended to be illustrative of the invention as a whole and is not to 
be construed as limiting the invention to the embodiments shown. It is appreciated that various 
modifications may occur to those skilled in the art that, while not specifically shown herein, are 
nevertheless within the true spirit and scope of the invention. 